Reading the data
The data for this analysis was created in folder 2019-03-29.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 0) {
stop("At least one argument must be supplied: input file.", call.=FALSE)
} else if (length(args) == 1) {
args[2] <- "."
}
input_file_name <- args[1]
output_dir <- args[2]
counts <- read.table(input_file_name, row.names=1, header=TRUE)
Three factors: population, diapause or hatching condition, and selective regime. I prefer the immediate hatching (instead of forced diapause) and the regular or periodic selective regime as the reference levels of the corresponding factors.
population <- factor(c("1","1","2","2","3","3","4","4","5","5","6","6"))
NestedPop <- factor(c("1","1","2","2","1","1","3","3","2","2","3","3"))
diapause <- factor(rep(c("hatch","diapause"), 6))
diapause <- relevel(diapause, "hatch")
regime <- factor(c("random","random","random","random","regular","regular",
"random","random","regular","regular","regular","regular"))
regime <- relevel(regime, "regular")
combined <- factor(c("RandHatch","RandDiap","RandHatch","RandDiap","RegulHatch","RegulDiap",
"RandHatch","RandDiap","RegulHatch","RegulDiap","RegulHatch","RegulDiap"))
combined <- relevel(combined, "RegulHatch")
y <- DGEList(counts=counts, group=regime)
y <- calcNormFactors(y, method="TMM")
barplot(y$samples$lib.size*1e-6, names=1:12, ylab="Library size (millions)")
It looks like the populations undergoing diapause systematically produced smaller libraries. It may be the effect of a smaller amount of eggs available. Population 5 is the exception.
Filtering
The recommended filtering consists on removing genes that have such a low expression level that do not have at least 5-10 counts on as many samples as groups there are. The rationale is that those genes cannot have enough reads in at least one sample of each group. However, instead of filtering on the raw counts, the edgeR manual suggests to translate whatever minimum number of reads per library in terms of counts per million. The reason is that the filtering should take into account differences in library sizes between samples. It probably makes sense, because otherwise, I would be applying an effectively higher (harsher) minimum threshold to the gene expression level in samples with smaller libraries. That would probably bias results, since we would be enriching the dataset in genes more highly expressed in some samples than in others.
Say that library sizes range between 250000 and 2000000. Then, applying a minimum expression level of 20 counts per million is the only way to make sure that even the small libraries are required to contain at least 5 counts, while at the same time setting the cutoff at the same biologically relevant level. In any case, as long as the minimum expression level is not enforced in all samples, the dataset retained will contain genes that do not reach the minimum in some samples. Since counts per million are (expectedly and actually) similarly distributed among samples, it should not be the case that any sample is enriched in low-expression genes.
threshold <- 5.0 / (min(y$samples$lib.size) / 1000000)
keep <- rowSums(cpm(y) > threshold) >= 4
y <- y[keep, ,keep.lib.sizes=FALSE]
plotMDS(y$counts)

The magnitude of variation along the axes seems very high in comparison with the examples available in the EdgeR manual, suggesting wild gene expression differences. With the exception of sample X5A_S2, dimension 1 separates samples undergoing forced diapause (named with a “C”, mostly on the left half of the plot) from those hatching right away (named with an “A”, mostly on the right). The second dimension is also meaningful, since it almost separates populations 1, 2, and 4, which underwent random regime, from 3, 5 and 6 (regular regime). Sample X2A_S7 is the main exception.
layout(mat=matrix(c(1,2,3,4,5,6,7,8,9,10,11,12), nrow=4, ncol=3, byrow=TRUE), width=c(1,1,1), height=c(1,1,1,1))
for (i in 1:12) {
plotMD(cpm(y, log=TRUE), column=i)
}

layout(mat=matrix(c(1), ncol=1, nrow=1))
A lot of genes seem to be differentially expressed. The total number of tags passing the filters is 32925. Let’s define a variable to store this value:
NumOfTags <- dim(y)[1]
Estimating the Biological Coefficient of Variation
I need different models:
- Model 1. To test for an effect of the hatching condition, the most powerful model may be this:
~population + diapause. That is to treet samples as paired by population. By using population, I account for variation due to selective regime.
- Model 2. Another simple model would ignore the blocking effect of the population and the interaction between the two main factors:
~regime + diapause.
- Model 3. If there are reasons to ignore the effect of population and the interaction between selective regime and hatching condition, I would combine both factors in one with four levels, following section 3.3.1 in the edgeR manual:
~0 + combined.
- Model 4. Just to check for a batch effect of the population:
~regime + diapause + regime:NestedPop.
- Model 5. For the sake of comparison:
~regime + diapause + regime:diapause.
- Model 6. And for the sake of completeness:
~regime + diapause + regime:diapause + regime:NestedPop.
design1 <- model.matrix(~population + diapause)
design2 <- model.matrix(~regime + diapause)
design3 <- model.matrix(~0 + combined)
design4 <- model.matrix(~regime + diapause + regime:NestedPop)
design5 <- model.matrix(~regime + diapause + regime:diapause)
design6 <- model.matrix(~regime + diapause + regime:diapause + regime:NestedPop)
y1 <- estimateDisp(y, design1)
y2 <- estimateDisp(y, design2)
y3 <- estimateDisp(y, design3)
y4 <- estimateDisp(y, design4)
y5 <- estimateDisp(y, design5)
y6 <- estimateDisp(y, design6)
layout(mat=matrix(c(1,2,3,4,5,6), ncol=3, nrow=2, byrow=TRUE))
plotBCV(y1, main='Model 1')
plotBCV(y2, main='Model 2')
plotBCV(y3, main='Model 3')
plotBCV(y4, main='Model 4')
plotBCV(y5, main='Model 5')
plotBCV(y6, main='Model 6')

Fitting the models
fit1 <- glmQLFit(y1, design1)
fit2 <- glmQLFit(y2, design2)
fit3 <- glmQLFit(y3, design3)
fit4 <- glmQLFit(y4, design4)
fit5 <- glmQLFit(y5, design5)
fit6 <- glmQLFit(y6, design6)
layout(mat=matrix(c(1,2,3,4,5,6), ncol=3, nrow=2, byrow=TRUE))
plotQLDisp(fit1, main='Model 1')
plotQLDisp(fit2, main='Model 2')
plotQLDisp(fit3, main='Model 3')
plotQLDisp(fit4, main='Model 4')
plotQLDisp(fit5, main='Model 5')
plotQLDisp(fit6, main='Model 6')

layout(mat=matrix(c(1,2,3,4,5,6), ncol=3, nrow=2, byrow=TRUE))
gof(fit1, plot=TRUE, main='Model 1')
gof(fit2, plot=TRUE, main='Model 2')
gof(fit3, plot=TRUE, main='Model 3')
gof(fit4, plot=TRUE, main='Model 4')
gof(fit5, plot=TRUE, main='Model 5')
gof(fit6, plot=TRUE, main='Model 6')

Genes affected by hatching condition
Model 1
Model 1 is expected to be the most sensitive to detect genes the expression of which is affected by enforcing diapause, after accounting for the variation among populations (which includes variation due to selective regime).
qlf1 <- glmQLFTest(fit1)
topTags(qlf1)
## Coefficient: diapausediapause
## logFC logCPM F PValue FDR
## TCONS_00075780 -3.1764026 -0.3164783 87.27115 7.244519e-06 0.2385258
## TCONS_00066966 -2.4700837 -0.3933047 56.25930 4.119853e-05 0.4729734
## TCONS_00065260 -2.0911696 -0.6323556 55.60755 4.309552e-05 0.4729734
## TCONS_00003139 -2.5576009 1.8623899 47.30525 7.993445e-05 0.6579605
## TCONS_00064093 -2.0791238 0.4548827 39.91429 1.506917e-04 0.7157760
## TCONS_00068658 -0.7212474 6.3625181 36.48838 2.092055e-04 0.7157760
## TCONS_00006143 1.6227212 -0.1176734 34.98506 2.435680e-04 0.7157760
## TCONS_00029970 -0.9225731 3.9659054 34.90854 2.454957e-04 0.7157760
## TCONS_00039858 -2.3088341 -1.2649910 34.26405 2.624986e-04 0.7157760
## TCONS_00011735 -1.0684960 3.4450536 34.25053 2.628707e-04 0.7157760
z1 <- topTags(qlf1, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("(p value)")))) +
ggtitle('Effect of enforcing diapause, model 1')

I would say the two first genes (XLOC_052427 and XLOC_045087) are significantly downregulated when enforcing diapause, with respect to the immediate hatching condition.
Model 2
Model 2 is very simple, and may include false positives, because it does not account for any interaction, nor for among-population variance.
qlf2 <- glmQLFTest(fit2, coef=3)
topTags(qlf2)
## Coefficient: diapausediapause
## logFC logCPM F PValue FDR
## TCONS_00075780 -3.1943494 -0.3213804 50.19249 8.523615e-06 0.1296082
## TCONS_00065260 -2.0927782 -0.6357618 49.30690 9.352737e-06 0.1296082
## TCONS_00003139 -2.5584129 1.8612144 47.13744 1.180941e-05 0.1296082
## TCONS_00066966 -2.5057699 -0.3955658 35.64548 4.789109e-05 0.3064585
## TCONS_00057853 -2.2440839 0.5642164 33.65855 6.313391e-05 0.3064585
## TCONS_00039858 -2.2497406 -1.2695655 32.63228 7.317018e-05 0.3064585
## TCONS_00045909 -1.3574387 3.4365378 32.58517 7.367345e-05 0.3064585
## TCONS_00068658 -0.7023335 6.3624992 30.71583 9.729233e-05 0.3064585
## TCONS_00066968 -2.2974892 1.6932906 30.57034 9.947215e-05 0.3064585
## TCONS_00016652 -0.9573694 2.8064409 30.40543 1.020115e-04 0.3064585
z1 <- topTags(qlf2, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("(p value)")))) +
ggtitle('Effect of enforcing diapause, model 2')

According to model 2, there are 0 tags significantly up- or downregulated by hatching condition, at FDR=0.1.
Model 3
In model 3, the effect of diapause may be understood in two different ways. First, we may want to know what genes have an average expression level different between hatching conditions. This is different from testing the effect of hatching condition twice: first among samples in a regular environment and then among those in the stochastic environment. Then, I could select the genes with the same significant tendency in both selective regimes. I will limit this analysis to the first option, comparing only average gene expression level.
qlf3 <- glmQLFTest(fit3, contrast=c(-0.5, 0.5, -0.5, 0.5))
topTags(qlf3)
## Coefficient: -0.5*combinedRegulHatch 0.5*combinedRandDiap -0.5*combinedRandHatch 0.5*combinedRegulDiap
## logFC logCPM F PValue FDR
## TCONS_00003139 -2.5237449 1.8609252 58.51207 6.418491e-06 0.1993512
## TCONS_00065260 -2.1649049 -0.6366429 49.72429 1.431840e-05 0.1993512
## TCONS_00075780 -3.2955584 -0.3226349 46.47588 1.984792e-05 0.1993512
## TCONS_00057853 -2.4091110 0.5634941 44.58239 2.421883e-05 0.1993512
## TCONS_00034103 -1.3499676 4.0916701 40.97554 3.608610e-05 0.2376270
## TCONS_00058280 1.2247852 2.7844586 32.70892 1.011500e-04 0.4377930
## TCONS_00066966 -2.4874088 -0.3961485 32.04266 1.108414e-04 0.4377930
## TCONS_00030747 -1.3986822 0.8720172 31.42397 1.208265e-04 0.4377930
## TCONS_00045909 -1.3593991 3.4364473 30.28123 1.421778e-04 0.4377930
## TCONS_00033760 -0.4651111 5.5828895 28.53918 1.838560e-04 0.4377930
z1 <- topTags(qlf3, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("(p value)")))) +
ggtitle('Effect of enforcing diapause, model 3')

According to model 3, there are 0 tags significantly up- or downregulated by hatching condition at FDR = 0.1.
Model 4
qlf4 <- glmQLFTest(fit4, coef=3)
topTags(qlf4)
## Coefficient: diapausediapause
## logFC logCPM F PValue FDR
## TCONS_00075780 -3.1764026 -0.3164783 87.27115 7.244519e-06 0.2385258
## TCONS_00066966 -2.4700837 -0.3933047 56.25930 4.119853e-05 0.4729734
## TCONS_00065260 -2.0911696 -0.6323556 55.60755 4.309552e-05 0.4729734
## TCONS_00003139 -2.5576009 1.8623899 47.30525 7.993445e-05 0.6579605
## TCONS_00064093 -2.0791238 0.4548827 39.91429 1.506917e-04 0.7157760
## TCONS_00068658 -0.7212474 6.3625181 36.48838 2.092055e-04 0.7157760
## TCONS_00006143 1.6227212 -0.1176734 34.98506 2.435680e-04 0.7157760
## TCONS_00029970 -0.9225731 3.9659054 34.90854 2.454957e-04 0.7157760
## TCONS_00039858 -2.3088341 -1.2649910 34.26405 2.624986e-04 0.7157760
## TCONS_00011735 -1.0684960 3.4450536 34.25053 2.628707e-04 0.7157760
z1 <- topTags(qlf4, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("(p value)")))) +
ggtitle('Effect of enforcing diapause, model 4')

According to model 4, there are 0 tags significantly up- or downregulated by hatching condition at FDR = 0.1.
Model 5
qlf5 <- glmQLFTest(fit5, coef=3)
topTags(qlf5)
## Coefficient: diapausediapause
## logFC logCPM F PValue FDR
## TCONS_00003139 -3.1667715 1.8609252 45.62175 2.169430e-05 0.5594549
## TCONS_00073422 -2.3699781 0.3573781 41.50302 3.398359e-05 0.5594549
## TCONS_00061427 2.7235326 -1.1544340 34.65022 7.808930e-05 0.8570301
## TCONS_00012716 1.8955543 0.4594967 26.79424 2.406782e-04 0.9035254
## TCONS_00075780 -3.0370409 -0.3226349 26.60638 2.479447e-04 0.9035254
## TCONS_00065474 1.6624489 -0.8320652 24.61866 3.429358e-04 0.9035254
## TCONS_00046899 -0.9014119 4.7544705 23.40107 4.221982e-04 0.9035254
## TCONS_00065260 -1.8688083 -0.6366429 23.01763 4.514729e-04 0.9035254
## TCONS_00031359 1.8597135 0.8252685 22.20952 5.213270e-04 0.9035254
## TCONS_00012517 -1.6034290 -1.0525963 21.23805 6.227523e-04 0.9035254
z1 <- topTags(qlf5, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("(p value)")))) +
ggtitle('Effect of enforcing diapause, model 5')

According to model 5, there are 0 tags significantly up- or downregulated by hatching condition at FDR = 0.1.
Model 6
qlf6 <- glmQLFTest(fit6, coef=3)
topTags(qlf6)
## Coefficient: diapausediapause
## logFC logCPM F PValue FDR
## TCONS_00003139 -3.183579 1.8617487 57.11781 8.549787e-05 0.9482227
## TCONS_00061427 2.664631 -1.1550477 55.46156 9.448442e-05 0.9482227
## TCONS_00075780 -3.040333 -0.3191142 48.27772 1.506918e-04 0.9482227
## TCONS_00073422 -2.327502 0.3569712 41.37848 2.508920e-04 0.9482227
## TCONS_00066966 -2.586594 -0.3945174 31.76613 5.860462e-04 0.9482227
## TCONS_00043129 1.827538 -0.9309368 30.44470 6.694247e-04 0.9482227
## TCONS_00011325 2.325938 -1.7953157 30.14818 6.901582e-04 0.9482227
## TCONS_00070211 1.195713 2.9754349 28.76705 7.981708e-04 0.9482227
## TCONS_00002778 -2.097683 1.3614324 27.60294 9.063003e-04 0.9482227
## TCONS_00065260 -1.865695 -0.6341799 27.17919 9.502242e-04 0.9482227
z1 <- topTags(qlf6, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("(p value)")))) +
ggtitle('Effect of enforcing diapause, model 6')

According to model 6, there are 0 tags significantly up- or downregulated by hatching condition at FDR = 0.10. And this is how the results from models 2, 4, 5 and 6 match:
significant <- cbind(model2 = topTags(qlf2, n=NumOfTags)$table$FDR <= 0.1,
model3 = topTags(qlf3, n=NumOfTags)$table$FDR <= 0.1,
model4 = topTags(qlf4, n=NumOfTags)$table$FDR <= 0.1,
model5 = topTags(qlf5, n=NumOfTags)$table$FDR <= 0.1,
model6 = topTags(qlf6, n=NumOfTags)$table$FDR <= 0.1)
vennDiagram(vennCounts(significant))

Interaction between selective regime and hatching condition
Model 6
qlf6 <- glmQLFTest(fit6, coef=4)
topTags(qlf6)
## Coefficient: regimerandom:diapausediapause
## logFC logCPM F PValue FDR
## TCONS_00070211 -1.853096 2.9754349 34.99662 0.0004312171 0.9996751
## TCONS_00061427 -3.100062 -1.1550477 31.25595 0.0006165894 0.9996751
## TCONS_00063626 -2.465284 2.1452446 27.99160 0.0008682485 0.9996751
## TCONS_00009010 2.645302 -1.5778409 27.21139 0.0009467938 0.9996751
## TCONS_00027579 -1.234561 4.1322298 26.79929 0.0009919147 0.9996751
## TCONS_00051951 1.298783 4.4154118 26.16535 0.0010667718 0.9996751
## TCONS_00011325 -3.269384 -1.7953157 25.93350 0.0010959269 0.9996751
## TCONS_00043129 -2.692850 -0.9309368 25.32647 0.0011771753 0.9996751
## TCONS_00021566 2.137028 1.2545410 24.21196 0.0013472362 0.9996751
## TCONS_00073422 2.397669 0.3569712 24.20681 0.0013480916 0.9996751
z1 <- topTags(qlf6, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("(p value)")))) +
ggtitle('Interaction regime-hatching')

According to model 6, there are 0 tags with a significant interaction term between selective regime and hatching condition at FDR = 0.25.
Model 5
qlf5 <- glmQLFTest(fit5, coef=4)
topTags(qlf5)
## Coefficient: regimerandom:diapausediapause
## logFC logCPM F PValue FDR
## TCONS_00027579 -1.240476 4.1322047 26.30317 0.0002602219 0.9999678
## TCONS_00073422 2.445762 0.3573781 24.34939 0.0003588465 0.9999678
## TCONS_00034103 -2.065226 4.0916701 24.28128 0.0003630074 0.9999678
## TCONS_00065474 -2.189305 -0.8320652 21.76299 0.0005653343 0.9999678
## TCONS_00009010 2.659012 -1.5757685 21.30288 0.0006153032 0.9999678
## TCONS_00061427 -3.208731 -1.1544340 20.99548 0.0006515772 0.9999678
## TCONS_00032972 2.543394 1.0476130 17.93183 0.0011919608 0.9999678
## TCONS_00035262 -2.523975 -1.5226597 16.29847 0.0016909075 0.9999678
## TCONS_00073421 1.904264 0.6925947 16.07411 0.0017771151 0.9999678
## TCONS_00051238 2.803414 -0.1322084 15.97409 0.0018172001 0.9999678
z1 <- topTags(qlf5, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("p value)")))) +
ggtitle('Interaction regime-hatching in model 5')

According to model 5, there are 0 tags with a significant interaction term between selective regime and hatching condition at FDR = 0.25. And this is how models 5 and 6 agree on what genes have such significant interaction term:
significant <- cbind(model5 = topTags(qlf5, n=NumOfTags)$table$FDR <= 0.25,
model6 = topTags(qlf6, n=NumOfTags)$table$FDR <= 0.25)
vennDiagram(vennCounts(significant))

Interaction between selective regime and population
When testing for significant coefficients showing the effect of the nested population, there are not one but 4 possible coefficients, each with a different fold change. For the volcano plot, I want to use for each gene the fold change that is most extreme (either positive or negative).
Model 6
qlf6 <- glmQLFTest(fit6, coef=5:8)
topTags(qlf6)
## Coefficient: regimeregular:NestedPop2 regimerandom:NestedPop2 regimeregular:NestedPop3 regimerandom:NestedPop3
## logFC.regimeregular.NestedPop2
## TCONS_00026548 -1.1413506
## TCONS_00030123 -2.6906255
## TCONS_00063463 -0.2974970
## TCONS_00025027 -0.8438688
## TCONS_00003241 -1.1571433
## TCONS_00033569 2.3233280
## TCONS_00045925 0.3229189
## TCONS_00030498 -1.0749831
## TCONS_00013983 0.2598862
## TCONS_00064052 0.1377394
## logFC.regimerandom.NestedPop2
## TCONS_00026548 -6.032047768
## TCONS_00030123 3.047168752
## TCONS_00063463 5.254675866
## TCONS_00025027 -1.866311560
## TCONS_00003241 -6.303036583
## TCONS_00033569 6.038682387
## TCONS_00045925 -0.008467676
## TCONS_00030498 2.485216101
## TCONS_00013983 -0.173188313
## TCONS_00064052 -4.154779242
## logFC.regimeregular.NestedPop3
## TCONS_00026548 -0.27140452
## TCONS_00030123 -0.87207561
## TCONS_00063463 -0.01528423
## TCONS_00025027 -1.57951581
## TCONS_00003241 -0.12955714
## TCONS_00033569 1.06772906
## TCONS_00045925 0.37486312
## TCONS_00030498 -0.34625570
## TCONS_00013983 -0.71391219
## TCONS_00064052 0.09669042
## logFC.regimerandom.NestedPop3 logCPM F
## TCONS_00026548 -2.3310170 3.974369 130.77984
## TCONS_00030123 -0.1547378 4.215993 99.96267
## TCONS_00063463 5.5669019 5.712106 75.24105
## TCONS_00025027 7.9373826 1.856160 72.27695
## TCONS_00003241 -5.1072724 2.915626 70.27872
## TCONS_00033569 0.3436243 1.246665 68.95986
## TCONS_00045925 -3.8559283 4.119313 66.51301
## TCONS_00030498 2.2910225 4.776605 61.90379
## TCONS_00013983 1.9734611 5.267264 59.97537
## TCONS_00064052 -0.2325656 3.659203 52.44031
## PValue FDR
## TCONS_00026548 4.622482e-07 0.01521952
## TCONS_00030123 1.258659e-06 0.02072067
## TCONS_00063463 3.605330e-06 0.02670425
## TCONS_00025027 4.181313e-06 0.02670425
## TCONS_00003241 4.636239e-06 0.02670425
## TCONS_00033569 4.971081e-06 0.02670425
## TCONS_00045925 5.677442e-06 0.02670425
## TCONS_00030498 7.390019e-06 0.03035915
## TCONS_00013983 8.298630e-06 0.03035915
## TCONS_00064052 1.355099e-05 0.04261569
z1 <- topTags(qlf6, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
z1$logFC <- z1[cbind(1:NumOfTags, apply(abs(z1[,1:4]), 1, which.max))]
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("p value)")))) +
ggtitle('Interaction regime-population in complete model')

According to model 6, there are 0 tags with a significant interaction term between selective regime and population, at FDR = 0.01.
Model 4
qlf4 <- glmQLFTest(fit4, coef=4:7)
topTags(qlf4)
## Coefficient: regimeregular:NestedPop2 regimerandom:NestedPop2 regimeregular:NestedPop3 regimerandom:NestedPop3
## logFC.regimeregular.NestedPop2
## TCONS_00026548 -1.1409526
## TCONS_00030123 -2.6894792
## TCONS_00063463 -0.2951510
## TCONS_00033569 2.3243806
## TCONS_00003241 -1.1463190
## TCONS_00025027 -0.8621010
## TCONS_00030498 -1.0706625
## TCONS_00013983 0.2617711
## TCONS_00045925 0.3280177
## TCONS_00064052 0.1330492
## logFC.regimerandom.NestedPop2
## TCONS_00026548 -6.03305847
## TCONS_00030123 3.04802029
## TCONS_00063463 5.24098752
## TCONS_00033569 6.03890907
## TCONS_00003241 -6.30693601
## TCONS_00025027 -1.87059247
## TCONS_00030498 2.48166748
## TCONS_00013983 -0.17693865
## TCONS_00045925 -0.02808466
## TCONS_00064052 -4.14807641
## logFC.regimeregular.NestedPop3
## TCONS_00026548 -0.27173460
## TCONS_00030123 -0.86878630
## TCONS_00063463 -0.02771677
## TCONS_00033569 1.06681900
## TCONS_00003241 -0.13724639
## TCONS_00025027 -1.54901147
## TCONS_00030498 -0.34445905
## TCONS_00013983 -0.72020809
## TCONS_00045925 0.36919873
## TCONS_00064052 0.09660807
## logFC.regimerandom.NestedPop3 logCPM F
## TCONS_00026548 -2.3291951 3.974440 149.00980
## TCONS_00030123 -0.1557691 4.216033 114.68553
## TCONS_00063463 5.5645213 5.712132 83.08438
## TCONS_00033569 0.3430100 1.246348 79.28693
## TCONS_00003241 -5.1033503 2.915882 78.22682
## TCONS_00025027 7.9009948 1.855765 76.60235
## TCONS_00030498 2.2887331 4.776628 65.90891
## TCONS_00013983 1.9712843 5.267254 62.26239
## TCONS_00045925 -3.8599191 4.119315 59.21081
## TCONS_00064052 -0.2271281 3.659217 59.02239
## PValue FDR
## TCONS_00026548 4.275333e-08 0.001407653
## TCONS_00030123 1.325843e-07 0.002182670
## TCONS_00063463 5.291482e-07 0.004107298
## TCONS_00033569 6.461973e-07 0.004107298
## TCONS_00003241 6.844032e-07 0.004107298
## TCONS_00025027 7.484825e-07 0.004107298
## TCONS_00030498 1.418573e-06 0.006672358
## TCONS_00013983 1.805327e-06 0.007430050
## TCONS_00045925 2.232558e-06 0.007450279
## TCONS_00064052 2.262803e-06 0.007450279
z1 <- topTags(qlf4, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
z1$logFC <- z1[cbind(1:NumOfTags, apply(abs(z1[,1:4]), 1, which.max))]
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("p value)")))) +
ggtitle('Interaction regime-population in model 4')

According to model 4, there are 13 tags with a significant interaction term between selective regimew and population, at FDR = 0.01. And this is how the genes identified as having such significant coefficients match between results of models 4 and 6:
significant <- cbind(model4 = topTags(qlf4, n=NumOfTags)$table$FDR <= 0.01,
model6 = topTags(qlf6, n=NumOfTags)$table$FDR <= 0.01)
vennDiagram(vennCounts(significant))

Main effect of the selective regime
Model 2
qlf2 <- glmQLFTest(fit2, coef=2)
topTags(qlf2)
## Coefficient: regimerandom
## logFC logCPM F PValue FDR
## TCONS_00039978 -4.6173714 0.22365985 143.91953 2.241241e-08 0.0007379285
## TCONS_00029769 -3.0470230 1.80520388 115.41217 8.260165e-08 0.0012992981
## TCONS_00040399 0.9179518 8.92887368 107.54159 1.247110e-07 0.0012992981
## TCONS_00021223 -2.8668066 -0.01397275 100.46951 1.849392e-07 0.0012992981
## TCONS_00011853 -2.5460496 2.93967442 99.34797 1.973118e-07 0.0012992981
## TCONS_00038529 0.9214999 7.16741454 88.34146 3.865922e-07 0.0021214249
## TCONS_00038865 -0.7311331 6.58265152 82.13746 5.839742e-07 0.0025255438
## TCONS_00003138 -0.9392902 4.42067867 72.58240 1.166432e-06 0.0025255438
## TCONS_00003251 0.7230441 6.33486902 71.10043 1.307644e-06 0.0025255438
## TCONS_00064027 1.0885549 5.50123439 70.94544 1.323522e-06 0.0025255438
z1 <- topTags(qlf2, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("p value)")))) +
ggtitle('Main effect of selective regime')

According to model 2, there are 6829 tags significantly up- or downregulated by selective regime at FDR=0.01.
Model 3
Using model 3 I can look for genes affected by selective regime in two ways. First, I could look for a significant change in average gene expression among samples in the same regime. Alternatively, I could separately look for genes affected by the selective regime among samples in the same hatching condition, and then compare the lists of significant genes. According to Edu, the latter makes more sense biologically, because we expect the selective regime to affect different sets of genes depending on whether eggs are allowed to hatch or forced to undergo diapause.
Effect of selective regime when eggs are forced to undergo diapause
qlf3b <- glmQLFTest(fit3, contrast=c(0, 1, 0, -1))
topTags(qlf3b)
## Coefficient: 1*combinedRandDiap -1*combinedRegulDiap
## logFC logCPM F PValue FDR
## TCONS_00029769 -3.2965960 1.8050729 66.72276 3.309341e-06 0.03606989
## TCONS_00039978 -4.1183780 0.2230642 61.02532 5.198752e-06 0.03606989
## TCONS_00065801 1.8348592 3.2085599 49.36062 1.483840e-05 0.03606989
## TCONS_00015560 -2.8069103 1.0279207 48.49532 1.616705e-05 0.03606989
## TCONS_00022801 -4.4660947 -0.2895002 47.08861 1.863619e-05 0.03606989
## TCONS_00016331 -3.0201067 4.0530687 45.12872 2.285137e-05 0.03606989
## TCONS_00040399 0.8758282 8.9288744 44.65223 2.403879e-05 0.03606989
## TCONS_00021223 -2.8048099 -0.0143081 43.57815 2.699036e-05 0.03606989
## TCONS_00029766 -2.8568208 0.2509729 42.81814 2.933617e-05 0.03606989
## TCONS_00011853 -2.4676627 2.9395772 42.69632 2.973397e-05 0.03606989
z1 <- topTags(qlf3b, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("p value)")))) +
ggtitle('Effect of selective regime, when hatching')

According to model 3, contrast b, there are 5153 tags differentially expressed between selective regimes among samples undergoing diapause, at FDR=0.05. There is a reasonable correlation between the fold change in the two hatching conditions:
z2 <- merge(topTags(qlf3a, n=NumOfTags)$table,
topTags(qlf3b, n=NumOfTags)$table,
by='row.names')
ggplot(data=z2, mapping=aes(x=logFC.x, y=logFC.y, color=FDR.x)) +
geom_point() + geom_smooth(method='lm') + xlab('logFC hatching') + ylab('logFC diapause')

Average effect of selective regime
qlf3c <- glmQLFTest(fit3, contrast=c(-0.5, 0.5, 0.5, -0.5))
topTags(qlf3c)
## Coefficient: -0.5*combinedRegulHatch 0.5*combinedRandDiap 0.5*combinedRandHatch -0.5*combinedRegulDiap
## logFC logCPM F PValue FDR
## TCONS_00039978 -4.6100858 0.2230642 149.80524 4.448198e-08 0.001464569
## TCONS_00029769 -3.0390219 1.8050729 113.46353 2.029559e-07 0.003341162
## TCONS_00040399 0.9179400 8.9288744 97.98003 4.459972e-07 0.003776015
## TCONS_00021223 -2.8652164 -0.0143081 92.45974 6.069402e-07 0.003776015
## TCONS_00011853 -2.5447739 2.9395772 91.32577 6.479151e-07 0.003776015
## TCONS_00044630 -1.8588425 3.7626651 89.16237 7.354132e-07 0.003776015
## TCONS_00038529 0.9214048 7.1674161 86.47510 8.640700e-07 0.003776015
## TCONS_00017559 1.0681899 5.6038378 85.49388 9.174828e-07 0.003776015
## TCONS_00038865 -0.7311071 6.5826486 74.25493 1.911967e-06 0.004235884
## TCONS_00047366 -1.4304939 7.0505776 73.56959 2.005709e-06 0.004235884
z1 <- topTags(qlf3c, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("p value)")))) +
ggtitle('Effect of selective regime, when hatching')

According to model 3, contrast c, there are 5153 tags with a significant change (across selective regimes) of average (across hatching conditions) expression levels, at FDR = 0.05.
Model 4
qlf4 <- glmQLFTest(fit4, coef=2)
topTags(qlf4)
## Coefficient: regimerandom
## logFC logCPM F PValue FDR
## TCONS_00063463 -4.850396 5.7121325 257.42048 7.895530e-08 0.002599603
## TCONS_00030498 -2.058907 4.7766284 144.69230 9.043471e-07 0.011985955
## TCONS_00039988 4.490469 2.8430068 129.23447 1.448063e-06 0.011985955
## TCONS_00063464 -3.536331 3.4154358 129.06126 1.456152e-06 0.011985955
## TCONS_00054330 -2.657064 3.0068386 93.15814 5.560592e-06 0.032604375
## TCONS_00039977 -2.801475 3.9487953 88.76704 6.763264e-06 0.032604375
## TCONS_00032909 5.182826 -0.2673881 88.22861 6.931834e-06 0.032604375
## TCONS_00039966 5.307610 0.0114170 83.68888 8.579315e-06 0.035309245
## TCONS_00021071 -2.389116 5.1511425 75.54121 1.293601e-05 0.042853966
## TCONS_00017572 1.259235 7.1997413 74.94986 1.334797e-05 0.042853966
z1 <- topTags(qlf4, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("p value)")))) +
ggtitle('Main effect of selective regime, model 4')

According to model 4, there are 56 tags significantly up- or downregulated by selective regime at FDR=0.25.
Model 5
qlf5 <- glmQLFTest(fit5, coef=2)
topTags(qlf5)
## Coefficient: regimerandom
## logFC logCPM F PValue FDR
## TCONS_00039978 -5.1017936 0.2230642 89.67057 7.136776e-07 0.02349784
## TCONS_00017559 1.3414218 5.6038378 66.75785 3.300492e-06 0.04048402
## TCONS_00044630 -2.2776795 3.7626651 65.31364 3.688749e-06 0.04048402
## TCONS_00047366 -1.8328896 7.0505776 59.01468 6.149849e-06 0.04953639
## TCONS_00035369 -2.7931628 2.2735303 54.93605 8.781579e-06 0.04953639
## TCONS_00040399 0.9600518 8.9288744 53.51970 9.989789e-06 0.04953639
## TCONS_00038529 1.0179935 7.1674161 52.63780 1.084032e-05 0.04953639
## TCONS_00047368 -1.9161628 5.0787282 51.52604 1.203618e-05 0.04953639
## TCONS_00021223 -2.9256229 -0.0143081 48.97599 1.541258e-05 0.05201875
## TCONS_00011853 -2.6218852 2.9395772 48.72635 1.579917e-05 0.05201875
z1 <- topTags(qlf5, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("p value)")))) +
ggtitle('Main effect of selective regime, model 5')

According to model 5, there are 8 tags significantly up- or downregulated by selective regime at FDR=0.25.
Model 6
qlf6 <- glmQLFTest(fit6, coef=2)
topTags(qlf6)
## Coefficient: regimerandom
## logFC logCPM F PValue FDR
## TCONS_00039988 4.875488 2.8431918 191.89898 1.143071e-06 0.01914951
## TCONS_00063463 -4.759067 5.7121056 190.98820 1.163220e-06 0.01914951
## TCONS_00030498 -1.983166 4.7766054 100.29821 1.199481e-05 0.13164302
## TCONS_00063464 -3.550605 3.4153339 90.87423 1.702825e-05 0.14016382
## TCONS_00039977 -2.963986 3.9487155 78.19477 2.890161e-05 0.19031710
## TCONS_00054330 -2.539632 3.0065718 62.62136 6.243326e-05 0.30936540
## TCONS_00039978 -4.688132 0.2247422 61.67751 6.577245e-05 0.30936540
## TCONS_00039966 5.006618 0.0131614 59.00178 7.654774e-05 0.31504180
## TCONS_00068983 -3.266633 1.9384321 53.38159 1.075323e-04 0.33007703
## TCONS_00036683 -3.306312 1.9123751 52.65146 1.126476e-04 0.33007703
z1 <- topTags(qlf6, n=NumOfTags)$table
z1$logPValue <- -log10(z1$PValue)
ggplot(data=z1, mapping=aes(x=logFC, y=logPValue, color=FDR)) +
geom_point() + theme_grey() + ylab(expression(paste("-log"[10], plain("p value)")))) +
ggtitle('Main effect of selective regime, model 6')

According to model 6, there are 5 tags significantly up- or downregulated by selective regime at FDR=0.25. And this is how the different models agree on what genes are differentially expressed in different selective regimes:
significant <- cbind(model2 = topTags(qlf2, n=NumOfTags)$table$FDR <= 0.01,
model3 = topTags(qlf3a, n=NumOfTags)$table$FDR <= 0.05,
model4 = topTags(qlf4, n=NumOfTags)$table$FDR <= 0.25,
model5 = topTags(qlf5, n=NumOfTags)$table$FDR <= 0.05,
model6 = topTags(qlf6, n=NumOfTags)$table$FDR <= 0.25)
vennDiagram(vennCounts(significant))

Note that I am using a different FDR for each model, to obtain more or less comparable and positive numbers.